- 
                Notifications
    You must be signed in to change notification settings 
- Fork 105
Replace QueueStateSync with QueueSync #111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
838f06e    to
    5c4341b      
    Compare
  
    | I'm going to use this PR to more or less rewrite the public API of #76. | 
2e68088    to
    b87f9f1      
    Compare
  
    | BTW, the next decision should be whether to keep the methods in  | 
6ae196d    to
    65c816d      
    Compare
  
    | fn set_next_avail(&mut self, next_avail: u16); | ||
| } | ||
|  | ||
| /// Struct to maintain information and manipulate state of a virtio queue. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have some usage scenarios using QueueStateSync for multi-threading. The idea is that: the guard object returned by QueueStateT::lock() can ensure invariance of the queue object for a series of queue operations. So if we want to execute a single operation on a queue, we may call QueueStateSync::method() directly. On the other hand, if we want to execute a series of operations on a queue in atomic mode, we get a guard object by QueueStateSync::lock() and then execute the operations against the guard object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What operations do you need? We can add them to QueueSync if needed, but I'd prefer to keep the external API as small as possible and force you to write lock() every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workload pattern may be abstracted as:
- a single thread wait for queue event and fetch available descriptor from the queue.
- the thread creates an asynchronous task for each available descriptor and dispatch async task to a multi-threading async engine.
- the async task does some IO operations to fulfill the service request.
- the async task eventually adds the descriptor to the used ring and notify guests.
On the other hand, much of our existing code is making use of the original queue interface, which is similar to QueueState and QueueStateSync. So it's appreciated to keep the flexibility for clients to choose among Queue, QueueState and QueueStateSync.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference is to start small, and only add new functionality with examples of how it improves real-world code without affecting performance negatively (as is often the case with too small critical sections).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we are trying to avoid the "too small critical sections" by QueueStateT::lock() and QueueStateGuard. But it's true that we need to be careful to use these interfaces for IO intensive devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you use lock()? It takes a &mut so in reality there's nothing to lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are taking a mutable reference to Arc<Mutex<QueueState>> and lock the underlying QueueState object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since an Arc is by definition providing only shared references, how are you getting a mutable one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example test case to use the QueueStateSync struct:
    #[test]
    fn test_queue_state_sync() {
        let mut q = QueueStateSync::new(0x1000);
        let mut q2 = q.clone();
        let q3 = q.clone();
        let barrier = Arc::new(Barrier::new(3));
        let b2 = barrier.clone();
        let b3 = barrier.clone();
        let t1 = std::thread::spawn(move || {
            {
                let guard = q2.lock();
                assert_eq!(guard.ready(), false);
            }
            b2.wait();
            b2.wait();
            {
                let guard = q2.lock();
                assert_eq!(guard.ready(), true);
            }
        });
        let t2 = std::thread::spawn(move || {
            assert_eq!(q3.ready(), false);
            b3.wait();
            b3.wait();
            assert_eq!(q3.ready(), true);
        });
        barrier.wait();
        q.set_ready(true);
        barrier.wait();
        t1.join().unwrap();
        t2.join().unwrap();
    }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the example. With the proposed API, q2 need not be mut anymore; q3 needs an explicit lock().
fa7bae8    to
    efd1d22      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is another major change in the way virtio queues are managed. At some point we have to stop breaking the API if we want projects to be able to use this crate (I know we try very hard to in Cloud Hypervisor).
The gain of changing the API is not entirely clear to me, and more than that, the part which binds the guest memory with the queue feels like one step too far for this interface.
I'd like to hear from other users (@alexandruag @slp) about this as well.
| The gain in changing the API is mostly to allow using shared references more easily, and using the same standard interior mutability pattern ( As an aside, using an  | 
We are using a `: GuestAddressSpace` trait bound for `AvailIter` and `DescriptorChain`, but the objects are actually only interested in the fact that `M` can dereference to a `GuestMemory` implementation. Signed-off-by: Alexandru Agache <[email protected]>
Signed-off-by: Alexandru Agache <[email protected]>
The extremely short critical sections do not make it possible to enforce any invariant on the virtqueue, making QueueStateSync more or less useless. Rip it out, the mutex abstraction will be introduced at the Queue level in the next commits. Signed-off-by: Paolo Bonzini <[email protected]>
QueueGuard encapsulates Queue during a sequence of accesses under the same critical section. A QueueGuard always uses the same GuestMemory for all operations, and can be extended to be a single critical section in the case of multithreaded VMMs. Signed-off-by: Paolo Bonzini <[email protected]>
This will allow to reuse the same tests for QueueState and QueueStateSync. The trait is _not_ part of the public API, which consists exclusively of acquire() and lock(). This is because with() is mostly useless for the multithreaded case, where you have a &self but not a &mut self. Signed-off-by: Paolo Bonzini <[email protected]>
Avoid constant repetition of "GuestMemoryMmap::<()>". Signed-off-by: Paolo Bonzini <[email protected]>
Make all accesses to the underlying QueueState go through the with() function. The function which will take care of locking when testing QueueSync. Signed-off-by: Paolo Bonzini <[email protected]>
Similar to Queue, QueueSync associates a GuestAddressSpace to a QueueState. The QueueState however is wrapped with reference counting and with a mutex. All access to the QueueSync object has to go through a QueueGuard, which takes care of locking the mutex. Signed-off-by: Paolo Bonzini <[email protected]>
The tests simply invoke the same code twice, once with a Queue and once with a QueueSync. test_queue_guard/test_queue_guard_sync are different, because one uses lock() and one uses acquire(). Suggested-by: Laura Loghin <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
| 
 It's my fault to introduce an  But it's really hard for us to make use of the  | 
| If you cannot use Queue, you can use a  But one point of this series is in fact to limit the cost of ArcSwap, because now you only pay the price of ArcSwap when you already have to lock a mutex (which is more expensive than getting an ArcSwap guard). | 
Signed-off-by: Paolo Bonzini <[email protected]>
| 
 I understand your concern. We're now in the process of figuring out what we actually want to publish to crates.io as an initial interface, so for the upcoming 2-3 weeks until we manage to get the interface clean and tidy, we're still going to probably introduce breaking changes. Once it is published it should be much easier because you can consume the version from crates.io. Are you already using this in Cloud Hypervisor? Could it be possible to temporarily pause the updates until we get the interface in a more stable state? | 
| I've been playing with this PR in the context of vhost-user-backend and virtiofsd-rs. In general, I think it makes the API more elegant, but I also think it'd be nicer if it was more flexible and not force the user to always go through  For instance, there might be cases in which the queue is going to be solely owned by an struct that wraps it, adding more functionality, and managing both locking and atomic_mem on its own ( I think one way to satisfy both a use case in which the queue is going to act as a completely independent entity and another in which is going to be wrapped inside another object, would be to make all methods from  That said, I also agree with @sboeuf that we should find an API that we all find acceptable and commit to it for  | 
| 
 Great to hear that you have done experiment with vhost-user-backend/virtiofsd, it's on my TODO list:) | 
| 
 Sure, I've just pushed my test branches: Those changes are made with the assumption that all methods in  | 
| @slp, can you try passing the  I can separate the guard object into a QueueReadGuard and a QueueWriteGuard too, if it's useful. Then we can even decide that Queue/QueueSync are unnecessary; but I'd really prefer if all users that need to access memory did go through  | 
| @bonzini In the context of  In a more general level, I'm starting to suspect the reason why we still haven't found an API for  | 
| 
 QueueGuard is flexible and it does not have to be tied to a lock guard. It can be tied to a simple  
 Right, so I'm trying to understand if QueueGuard alone provides the right boundary between not-too-much-policy and consistent access to the GuestAddressSpace. | 
| 
 That's what we have expected for a long time:) | 
| 
 @bonzini I see your point, and certainly the interface looks better this way. We still have one problem though, which I think is the same @jiangliu is facing. Let me illustrate it with an example (I've simplified the code for readability): In this scenario, if you acquire a  So far the only solution I've been able to come up with is being able to directly have  | 
| 
 IIUC that's because you're already borrowing mutably when building the QueueGuard: let queue = vring_state.get_queue_mut();
...
vring_state.signal_used_queue().unwrap();For the specific case of signal_used_queue(), which can be a method on  
 Yes, that would be a valid solution too. Not a great one because it pushes the responsibility out of vm-virtio, but I guess it's okay. | 
| 
 Hm... I guess there might be a way to make it work with  
 We can give users both options, documenting that the preferred one is consuming  @jiangliu @alexandruag what do you think? | 
| Maybe signal_used_queue() needs to be a trait that QueueGuard implements. Using some kind of smart pointer it should be possible to create a QueueGuard instance that holds a  | 
| 
 It would be great to provide raw access to QueueState. After all it's needed for vm snapshot/upgrading/migration. | 
| I agree that it's okay to give access to QueueState, but it's also important that QueueGuard be usable. | 
There were a couple of changes in rust-vmm#76 that are not yet tested. PR that covers that functionality is in review here: rust-vmm#111. Until that one is merged, the coverage needs to be decreased. Signed-off-by: Laura Loghin <[email protected]>
There were a couple of changes in #76 that are not yet tested. PR that covers that functionality is in review here: #111. Until that one is merged, the coverage needs to be decreased. Signed-off-by: Laura Loghin <[email protected]>
| Hi @bonzini , we have factored the  Current implementation works for Cloud Hypervisor, virtiofsd, nydus and dragonball, and there are proposed non-breaking enhancement for Firecracker. So is it feasible to publish v0.1 recently and postpone this PR for v0.2? | 
| Yes, I agree this is the best that can be done without breaking the API. | 
The multithreaded QueueStateSync introduced in #76 has two problems:
&mut self; this requires unnecessary cloning of the Queue or QueueStateSync objects in order to ensure that you have a mutable reference.The root issue here is that the locking should be done at the Queue level, because Queue already manages interior mutability of the
GuestAddressSpace; the queue can easily provide you with both the currentGuestMemoryof the address space and aQueueStateto operate on (including via anAvailIter, this time).On top of this, this pull request also adds tests for the new QueueSync, as @lauralt reminded during the discussion in #76 (I might add that listening to her, probably, would have made the deficiencies more obvious). In order to reuse the test code, a private API is added that is similar to #76's
lock(&mut self). This API is not public because it has all the defects listed above, but it is fine for tests because (unlike real-world code) they have easy access to a&mut QueueSync.